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The gradient method for solving two-point boundary- 
value problems is discussed and a modification of the 
gradient direction is proposed. An algorithm for 
efficiently determining the step size is also derived. 
Analytic and numerical examples illustrating the efficien- 


cy of the method are presented. 
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1. THE PROBLEM AND NOTATION ' 

The objective of controlling a physical system is to 
make the system function in the most desirable manner. If 
the system happens to be a production machine in a factory, 
the application of the best control will yield the maximum 
machine output for a specified expenditure of raw materi- 
als, power, and production time. Perhaps the system is an 
aircraft which is to be automatically controlled during 
landing. In this situation the best choice of control 
might be expected to minimize deviations from a specified 
flight path. 

The system, or plant to be controlled, is assumed to 


be in the state variable form: 


x(t) i= a(x(t) ,u(t) ,t) (1) 
where 
xX =| x, | a (n x 1) vector u= u, |a (m x 1) vector 
a) Us 
x, u, 


The effectiveness of a controlled process at accom- 
plishing an assigned task is measured by a functional 
called the performance index or performance measure, J(u). 


The performance measure is assumed to be of the form: 


“f 

J(u) = h [x(t ,) te | ii | g(x(t),u(t),t) dt (2) 
t 

O 


The optimal control is the control which minimizes 
the performance messure. 

There are two general approaches that can be followed 
when computing the optimal control. The first of these, 
dynamic programming, reduces the problem to ene of making 
a finite number of optimal decisions starting at t = te 
and working backwards in time. Decisions are based on the 


(1), (2) who has 


"principle of optimality" due to Bellman 
used the method extensively. While efficient, this method 
suffers from the requirement of large amounts of computer 
stonage. 

The alternative to dynamic programming lies in varia~ 
tional calculus. In this method, the performance measure 


is augmented by Lagranye multipiilers and the functional 


oe 
pf 

J,(a) = (uw) 4 | p(t) (alatt),u(t),t)-K(t))dt (3) 
t | 


Or 


is obtained. The superscript T denotes the transpose of a 
vector or matrix. p denotes the cestate (or adjoint) vec- 
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fae and soe or Gamension nx iL. 


th 


When the state equations are satisfied, as they mist be, 
J = J since the inteyrand or (3) vanishes. 

Since a minimum cf the augmented functional is sought, 
the Waridti.on GF J, (denoted 61) is obtained and the fun- 
damental theorem of variational calenvlus is employed fo 
derive necessary conditicns whicl: must be satisfied by a 


candidate for an optimal solution. 


It is convenient at this point to introduce a func- 


tion, KH, the Hamiltonian, which is jefined as: 


K(x (t) sue rp (t) ,t) = g (x(t) ju(t) ,t)+p7 (t)[ a(x(t) sw(t) | 


(4) 


Using this notation, the necessary conditions for op- 


timality are: tgee 3) J 


X(t) = ¥, K(x" (t),u (t)-p (t),t) (5) 
p (t) = - Ye K(x" (t),u (t),p (t),t) (6) 
oe Soy ea2 ya ee! SS (7) 
and 
Ly, nd” “(te) | 


+ [H(x" (2) -u" (t-) P (te) +t _) + = n(x” (t,),t,) | 6t. = 0 


The star superscript denotes the optimal solution and vx 


denotes the matrix| @x 
al 


ov, 
aan ee 


Ox 
n 


bx, is the final variation of the state vector and bt. is 


the variation of the final time. 


The initial values of the states are assumed to be 
knewn and the relationships between the tinal conditions 
of the states and costates are given by equation (8). 
Thus the problem is a non linear two-point boundary-value 
problem. 

Many proposals for solving this problem have been 
made to date. Among these methods are the gradient tech- 


(4), (5), 6) | (7) 


hique variation of extremals , and the meth- 
od of quasilinearization (generalized Newton-Raphson meth- 
aay 2? . All of these methods involve an initial guess 
followed by an iterative procedure designed to calculate 
the optimal solution. Variation of extremals requires 

the designer to select initial values of the cestate equa- 
tions (p(t.)) and iterate until all final boundary condi-~ 
tions are satisfied. Quasilinearization demands an ini- 
Fial selection of the state and costate trajectories ard 
the iterative process contintes unkii the difference be- 
tween successive trajectecries becomes sufficiently small. 
Both of these methods may penalize the designer whe makes 
an unfortunate initial quess by nonconvergence; However, 


convergence, when achieved, is swift. A general discussion 


of the gradient method is the tupic of Seciion 2, 


2. THE GRADIENT METHOD 
In this section the gradient method is presented. 
The difficulties associated with the method, as well as 
its attributes, are discussed. 


2.1 Description of the Gradient Method 


Consider the systen: 
G8) = as), u(t) ,t) (1) 


and the performance measure: 


s(u) = hlx(t,),t,| + J ocxtt) a(t) eat (2) 
O 
which is to be minimized. 

The necessary condition for a solution to be optimal 
is that the variation 65 (u ) be zero, for problems with 
no constraints imposed on the control. Taking the varia- 
tion of the augmented functional J. and equating to zero 
results in the necessary conditions stated in equations 
(5),(6),(7), and (8). 

The gradient method is started by selecting a trial 
control history uy) and satisfying equations (5),(6), and 


(8). The following equation then represents the entire 


performance measure variation. 


br(u ()) = Ig M(x) (ty) ua) (e),p™ (ey, t ) bu (t) dt 


(9) 


By choosing 6u(t) correctly, a lower value of the perform- 


ance measure can be found. The process 2s tin weecated 


with the new trial control, yftth) | This control is selec- 


ted according to the relation: 


uP) (ey = a) (ey - Boo sec) (ey) (ey p™ (eye) 


If the procedure converges on the nth iteration, the result 
is y (9) (ey =u (t). 

Since the equations in all but a very few trivial 
problems are complicated (and usually non linear) and so- 
lutions in closed form are not readily available, the sets 
of differential equations are integrated numerically on a 
digital computer. Then x(t) and p(t) are not known for 
all values of t, but are available only at a discrete set 
of times, t,, where t,,, = t, + At, with At being the in- 
tegration step size. By Similar arguments, the choice of 
control histories is restricted to some subset of U (the 
set of all admissible control histories) which is defined 
at the set of times (t.) available during the computation. 

A practical choice is to make u(t) a member of a set, 
called arbitrarily 9, of all piecewise constant functions. 
That is u(t) = u,,te|t,,t,+ At | sf 

All other classes of functions are approximated by 
for the limiting case as At ~ 0. The member of the set 2 
chosen is denoted OQ, the subscript denoting the number of 
subintervals in the control interval Peta de The maximum 


mumber of precewisesconstant controls in the control in- 
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terval is determined by the integration step size and is 
c.-t 


calculated from L -=-° : 
65 (u'*) ) = 0 implies either 


a. Vi Cel) (yu (ey,p | (ty,t) = 0, tele. t, | 


or 


= 
£ r : 
c. J. [97x (x2) (ey a2) (typ 2) (tye) bu(t) lat = 0 
O 


(10) 

Case b above represents the trivial solution and is 
of no value. Of the non trivial choices, case a above is 
immediately appealing as it is identical to equation (7) 
of the set of necessary conditions. But, since u(t) is re- 
stricted to the set a (the price of numerical computa- 
tion), 63 (ut) can only equal zero if the state and costate 
vectors also happen to be piecewise constant or if the op- 
timal control happens to be piecewise constant. For this 
case there is no clear cut strategy to improve the control 
history and therefore proceed, eventually, to the optimal 
control. Present gradient techniques resort to satisfying 
the requirements dictated by equation (7) at the discrete 
points where all required information is available and use 


this data in the selection of a "better" control history. 
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This technique is satisfactory initially, but yields poor 
results as the optimum is approached. 

Note should be made here that the gradient algorithm 
can only approach the true optimum as the integration step 
size At approaches zero. This is caused by demanding that 
u(t) be a member of the set Q.. Tt is not felt; however, 
that the error will be of great concern in problems of 
engineering interest. 

2.2 Motivation for Selection of the Gradient Method 

The gradient method, while suffering from difficulties 
yet to be discussed, has some appealing attributes. 

The first favorable characteristic is the variable to 
be guessed, which in this method is the control history. 
The designer is much more likely to have insight into a 
suitable control history than either the remaining unknown 
boundary conditions required by variation of extremals or 
the trajectories required by quasilinearization. Fora 
stable system, the initial guess at a control might well 
be to "do nothing” and guess uw!) ce) = On te lt aera 
Equally important is the fact that the initial guess is 
not usually crucial to the success of the method. 

Another feature of the gradient method is its rela- 
tive ease of programming. For a stable system all inte- 
grations are numerically stable, as the stable state equa- 
tions are integrated forward in time while the unstable 
costate equations are integrated backwards in time. All 


equations are relatively easy to derive. A flow diagram 
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than the termination criterion? 







Stop 
No 
a2) (ey =u) (t)- Bx ok 


Figure l. 
Flow diagram of the Gradient Algorithm 
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of the algorithm is shown in Figure l. 
2.3 Difficulties Encountered with the Gradient Method 
As mentioned above, the gradient method is plagued 
with some rather poor qualities, apparently the price of 
forgiveness in the selection of the initial guess. These 
problems are difficulties involved in selecting the step 
size (8) and slow convergence in the vicinity of the 
optimum. A proposed change in the selection of a gra- 


dient is presented in Section 3. 
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‘3. $A MODIFICATION TO THE GRADIENT 
An alternative solution to the dilemma of finding a 
strategy that will converge to the optimum choice of con- 
trol (in the set Q) is to restrict 6u(t) to the set Q. 
also and base a strategy on case c (equation (10)) of sec- 
tion 2.1. With 6u(t) so restricted equation (9) can be 


rewritten as: 


| L t.+At } | 
63 (u") ) = ) ou" | } v (x4) (t) wu) (4) a (t) t) dt 
= 2° 


J 
(1A) 
and the equivalent of case c is obtained by setting 
4c 
67(u ) = 0. Since bu. is not zero (except in the trivial 


case), the remaining condition for optimality can be im- 
mediately written as: 


t.+At 
* * * 
| v(x" (t),u"(t),p (t),t) at = 0 (12) 
i u 
J a 
5° = O,1,2,...0m 


For a scalar control equation (12) may be expressed as: 


t.+tAt 

5 ” ” 7 must 

am. (x*(t),u"(t),p(t),t) at = 0 (13) 
J 

| =6@.,1,20..4& 
Let 

t .+At 

ty vx (x(t) u(t) p(t) ,t) di. F x(t) ,u(t) -p(t) ) 
J 


With u restricted to the set a. this functional may be 


written as P = Pag (Xs) Ug ee ee Uaees +s Uy eP)- From this 


i 


an expression for ou. 
interval is sought. 
may be obtained by expanding P 


jectory. 


theseconbredevarilation in eae 7th 


An approximation to this variation 


about the reference tra- 


BF (x+6x, u, tou, -uytduy5, eco 7. + ou. eee »u,+5u,,ptdp) 


OF. OF. 
oe , oe 
— Pa (Xo Ugee+-sUs,-+- ssp) t ax x 
OF. OF. OF. OF. 
+ 3a, du,t coe + aa, oa coe + oa du, + ap dp 


+ higher order terms 


The terms containing first 
prise the first variation of P 


Of interest is the value of oe 


partial derivatives com- 
and will be denoted OF, . 


caused by the perturbation 


Of them@yen= control. 

Ast bu, = Ol x 3 

Then 

OF. OF. OF. 
beep 6x 5 bu. a 6 
y tT Vs J me 
OF. OF. 
Assuming thet o 5x and s— 6p are small enough to be 


neglected leads to a first order approximation of OF.. 


OF. 
oF = Ba, 845 
J 
Then 


nd 
rd 


Eg (x+6X,U, U5, ie -su,+bu., -2-/U, pt Sp ) 


OFS 


Se A PS «Te au, uj 


gos. oa,p) # 


For Ps in some neighborhood of zero, Us must lie in some 
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* 
neighborhood of om: ™ 
Let 
Pa (X4+0%,.Uy,Up,--.,U,+6u5,...,U,,pt+dp) 
— EF hs * 
= 5 eet ye eee mp ovonenn ly, wid ) =0 


Then 


Be Aoes Ub yael one oee er e 7Uz PD). 


3 | 
a Su, BP. (Xp, Uy Us sere — eee ,Uy +P) bu. = 6 


EF. (X,U,,U5, e oo ey ee - Uz, /P) 


and Ouas = - ge meie S  m  e (14) 
uy = (X,U,, Ugr--- am, — Us-P) 


If the control terms of the system state equations 
are no greater than quadratic and the control terms of the 
performance measure are quadratic, the denominator of equa-— 
tion (14) is. a constant. For this case a nominal value of 
Su... = -F, (XpUy rece Uz oP) /At may be employed. 

In the notation of the problem this modified express- 
ion is written as: 


eae 
. 1 , _ 
Oe = —- Be V (x(t) -u(t) »p(t) ,t) dt (15) 


This expression indicates that an integral mean value stra- 
tegy is employed to determine oh, 

The time history, or collection of all the Suis is 
taken to be the gradient of s(u't)) with respect to the 


control, in the.sense that adjusting the controls locally 


lig’ 


in the direction of the gradient will result in the great- 
est change in the performance measure. 

Use of this technique leads to a modified control gra- 
dient which is employed with the standard gradient algo- 
rithm. If the modification is valid, the resulting gra- 
dient should possesscertain properties and eventually lead 
to the optimum choice of control. A general discussion of 


these properties is presented in Section 4. 
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4. A TEST OF METHOD EFFICIENCY 

The analogy between the gradient method and a hill 
climbing procedure is presented. Two possible climbing 
techniques are stated, one of which generates orthogonal 
gradient vectors. A comparison of problems solved by the 
standard and improved gradient methods is made. 

4.1 The Gradient Method as a Hill Climbing Procedure 

The gradient method for the solution of optimal con- 
trol problem is often compared to the hypothetical problem 
of a survey party attempting to climb to the top of a 
mountain (in our case descend to the floor ofavalley) ina 
dense fog. It is desirable to reach the unknown and un- 
seen summit in a minimum amount of time. 

The head surveyor is faced with two likely strategies 
by which he may achieve the summit. He can take a local 
survey, move a few feet in the direction of steepest as- 
cent, and then take another local survey to modify his di- 
rection. He may be reasonably assured that by use of this 
technique the shortest distance between his starting point 
and the top of the mountain will be covered. Unfortunate- 
ly, a very high number of surveys will probably be neces- 
sary to maintain position on the path with the ensuing haz- 
ard of much "wasted" time. 

An alternative to this strategy is to make a local 
survey and then proceed to climb in the indicated direction 
until no further ascent is possible. At this time another 


survey is made and the climb resumes in the new direction 


is 


determined. This approach is likely to deviate from the 
path of steepest ascent traversed before and runs the risk 
of running into plateaus which would give a false illusion 
of being at the summit. Its main virtue is the likelihood 
that fewer surveys will be required and, hopefully, the 
time of ascent will be diminished. 

When following the second strategy, it is noticed 
that successive gradients will always be orthogonal. If 
this was not true, a further improvement could have been 
achieved by continuing to climb in the direction of the 
previous gradient. 

The problem described above also occurs when using 
the gradient technique to compute an optimum control. The 
optimal control is the one which lies in the mathematical 
valley defined by the state equations of the system and 
the structure of the selected performance measure. Like 
the surveyor, the control designer can only observe local 
conditions and must use this data in his optimization pro- 
cedure. 

If a strategy similar to the alternative plan proposed 
above is followed, the computation of the optimal control 
proceeds as follows: 

A control is selected and the gradient is calculated, 
establishing a direction of search. A number of trials 
are made to determine the minimum value of the performance 
measure that can be found by moving along this gradient. 


At the minimum value point a new gradient is computed and 
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the process is repeated. When no improvement can be at- 
tained in any direction, the process is assumed to have 
converged to a local minimum. 

Using this type of procedure, if an accurate search 
for a minimum is performed, and if the gradient is correc- 
tly evaluated, successive gradients will be orthogonal. 
4.2 Comparison of Standard and Modified Gradient Results 

Problems computed using the standard gradient tech- 
niques approximate the orthogonality property in the ini- 
tial stages, but as the optimum is approached any approx- 
imation to orthogonality completely breaks down. The pro- 
cess finally "converges" with an indication that, while 
the norm of the gradient is sufficiently large, no further 
improvement can be made by any move in the indicated gra- 
dient direction. This leads to the conclusion that the 
true gradient has not been computed from the standard tech- 
niques. 

When the modified gradient is used with the same 
search procedure, successive gradients maintain the pro- 
perty of orthogonality throughout the problem computation 
and improvement is always achieved by moving along the 
indicated gradient. Convergence is smoother than that of 
the standard method and is usually achieved by the re- 
quanemente thatke the norm of the gradient be less than some 
small specified constant. The conclusion is that the 
modification leads to the computation of an improved gra- 


dient of J with respect to the control u(t). 


EAL 


5. TWO SINGLE VARIABLE SEARCH PROCEDURES 

Computation of the gradient establishes a direction 
in which a search for a minimum will prove fruitful. The 
size of the step to be made in this direction remains to 
be determined. Two single variable search procedures are 
presented by which this step size may be calculated. 
5.1 The Golden Section Search 

The Golden Section search described by Wilde (9) pro- 
vides an efficient means for finding the minimum of an 
unknown, but assumed unimodal, function, once the minimum 
is known to lie in a specified closed interval. The prin- 
Cipal properties of unimodality of interest are that for a 
unimodal function, f(x), only a single minimun, f(x )- 
exists in a closed interval, I, and for two points, a and 
b, both in the interval, and on the same side of the mini- 
mum, £(a) < £(b) if |x -al <|x -b|l. 


The method is described briefly below. 


Given: an interval of length Ly containing a single minimum 


ee fx) 


Find ; an interval of length Ly < Ly such that the minimum 


of f(x) is contained in Ly 
A typical problem is shown in Figure 2. Let the lower 
bound of the interval be at a and the upper bound at 


1ssg2vem by Le= ba. 


ibe Then the interval length L 0 


O 
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Place an experiment in the interval at point cwan@ let 


L L 
L, = c-a. Point c is chosen such that — = _ One 
L Ei 
solution of this equation is 7 = => 1.68034. Call 
il 5-1 


thes mumiber T. Then c = atL, = atL)/T. Another experi- 
ment is placed symmetrically in the interval at a point 

e' = b-L)/T. 

The following decision is made on the basis of the experi- 


ments at points c and c'; 


* 
if fiic') < f(c) thema <s aac cid i 


1=¢-a 
% , 

if £4c') = £(c) them c’= = <= c ana Ly = c-c'! 
* 

if f(c') > £(c) then o"3x .<- c and Ixy = b=c" 


li 
Thus an interval L, < Ly containing the minimum has been 
generated due to the unimodality of f(x) and the placement 
of the experiments. 

The advantage of the method becomes apparent when 
fumther redcutions of ENE ™INterver are attempted, the ob= 
jective being to generate a sequence of decreasing inter- 
val tengths such that LL < y, where y denotes the stopping 
Criteumenwor the search: 

To generate L., experiments are placed in the reduced 
interval Ly in the same manner as the experiments were 
placed in the original interval. Assume that a < x <e 
and thus L, = c-a. Two new experiments are to be placed 
in this new interval at points d and d’. 

Then d = atL, /T enaloles (Ss) c-L,/T 
But c' = b-L,/T and, noting that L, = Lo/T. d = atLy/ 


24 


An express#zenfor ¢c'-d may be written as: 


Cie he Tk Pee = a 
ris 


Since one experiment already exists in the reduced inter- 
val L,, only one additional experiment must be performed 
to further reduce the interval. 

To apply this search method to the gradient algorithm 
the interval of the search must first be defined. It is 
already known that moving in the gradient direction will 
guarantee at least some improvement in the performance 
measure. A crude search is performed until the perform- 
ance measure is greater than the value found at the origin 
of the search. This is achieved by initially taking a 
unit step in the gradient direction and doubling subse- 
quent step sizes until this condition is met. The result- 
ant closed interval must then contain the minimum and can 
be reduced by means of the golden section search procedure. 
Table 1 shows the number of experiments required to reduce 
an interval of unit length to a desired length L.: 

Application of the golden section search and modified 
gradient technique to the problems run to date has result- 
ed in convergence ina very few iterations and established 
the validity of the modified gradient method. Many experi- 
ments are required per iteration, however; and so the 
total number of forward integrations that must be perform- 
ed is still rather high. A method to improve this short- 


coming is discussed next. 
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5.2 A Quadratic Approximation Search Technique 


Since the search for a minimum in the gradient direc- 
tion involves the single variable 8, J can be written as 
some unknown function of this variable J(8). From pre- 
vious calculations J(0) is known and <5 J(0) is known to 
be less than or equal to zero, with equality applying at 
the optimum solution of the main problem. 

An alternative solution to performing a golden section 
search is to, rather boldly, assume that J(8) is a quadra- 
tic of mie form J( fa) = a B? +b B+c and then to use the mini- 
mum of this quadratic as the optimum value of B. While 
there is no reason to believe that this should be as accu- 
rate as a search procedure, it requires far fewer experi- 
ments to achieve an estimate. 


Given a quadratic, f(x), and three points x, < x, <x 


il 2 
it can be shown that if f(x, ) > E(x.) and f(x.) > £(x,), 


oy 
x * 
then x. < x < x. where £(x ) = min £(x). 


1 3 
A quadratic may be fitted Peco a to any three points 
XpeXoe and X3 in the following manner. 
Let f(x) = Be JE where a, b, and c are unknown con- 
stants. 
Using the values of f(x) at the three known points 


the following equations may be written: 


=— ax 2 bx +c 


£(x,) 1 i 


2 
= aX. +bx.+c 


£ (x5) 2 2 


= zZ 
f(x.) ax, tbx., +c 
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or in matrix notation 


2 

jad ew 5 
- 2 

to Ame es 2 
2 

f, - 2 


The expression is 


coefficients. 


2 
a x) xX) 1 
= a 
b Xo Xo 1 
2 
Cc X3 X3 if 
The matrix Baie 
2 
#2 
a 


therefore its inverse always exists. 


easily solved for the vector of 


-lr 
sai 
= 
a 
x4 il is positive definite and 
Xo ji 
X3 1 


Thus the co- 


efficients a, b, and c can always be computed. 


The minimum of this quadratic is found by setting the 


derivative equal to zero. 


* 
This results in x = -b/2a. 


The three test points through which the quadratic is 


to be fitted are again found rather crudely. 


point is placed at the origin of the search. 


The first 


No experi- 


ment need be performed here as the value of the perform- 


ance measure 1S already known. 


A test is next performed 


with a step size of unit length and the results of this 


test determine the direction of further search along the 


gradient. 
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1. If the performance measure at this point is great- 
er than that at the origin the third test point 
is placed midway between the preceding test and 
the origin. Successive tests are made in this 
manner until the value of the performance measure 
is less than that found at the origin, and final- 
ly, the quadratic form is fitted to the three 
points closest to the origin. 

2. If the performance measure at the first test 
point is less than that at the origin, the step 
size is doubled and further tests are placed in 
this manner until the performance measure increas- 
es in value. The quadratic is then fitted through 
the three test points that are farthest from the 
Oran . 

Once the three points have been determined, the con- 

Stemi eco bsare easily evalyatedsamd the ,quadragic ap- 


* 
proximation of B generated. 


28 


6. AN ANALYTIC EXAMPLE 
A simple example will serve to illustrate some of 
the differences between the standard gradient method and 
the method utilizing the modified gradient computation. 
Consider the linear first order system: 
x(t) = -x(t) + u(t) 


and the performance measure: 


rE 


J = ile) - | 1/,u* (t) dt 
t 
O 


The following additional data is given. 


oi O, te = i, x (to) = Sees 4.0, x(t) unspecified 


Forming J. and taking the variation, the necessary 


conditions can be written. 


= * * * 
L. - fe = —e (tr) + @ ft) wth F.C. (to) = Xp 
» be Ge) o= pT we) 

* * 
S . wl (t) = 2% (t,) 

* * 
4. u (t) = pr (®) 


These equations may be solved simultaneously for the opti- 


mal solution. 


x (t) = xe -k/2 fet-e* | 
x t # 

p (t) = ke’, p (1) = 2x (1) 

a es) = ge" 


yn, 


with XQ = 4.0 
* G 
k =90 958063, u (t) = -0.58063e 
* * 
x (1) = OV78915, J = owen 
This will be considered as the reference solution. 
To exaggerate the difference between the two methods, 


the best control in the set @ will be calculated. With 


this choice of control, the following relationships are 


found . 
x (Ee x)e tu(1-e"*) 
p(t) = ke", k = 2e “*x(t,) 


v (x(t) ,u(t),p(t),t) = utke® 


The procedure is started by selecting an initial con- 
trol history and choosing the next control accordingste 
the method selected. If the integration is done numerical- 
iy. Ow etal computer with a step size of 1.0, the only 
values of x and p available are those at the end points of 
the interval. The initial guess of u(t) will be 0.0. 
Case I. Standard gradient procedures 
Su is chosen to satisfy equation (7) at discrete points in 


the interval. For an extreme case 6u is based on condi- 


tions existing at t = 0. 
Then sch) = ee! ok 
and yee ol igre 
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Applying this to the problem equations: 


x(1) = x a awe AN Gy 7 


0 
k= Pe 134i) 


y (it) - _k 


Elimination of variables results in a difference equation 
BM Us 


y ttl). ~0.271x,-0.469u ‘*) 


# 
which converges to u = -0.1845x,. 


Application of this control to the system results in 
& ~ x 
vw = -0.739, x (1) = 1.0065 JF => 205 


note that: u(t) + p(t) ea” -0.739+0.739 = 0 


Apparently the "gradient" is zero and the optimal control 
has been found. 

However, a direct search among controls in the set Q, 
results in 

wu = -1.034, x (1) = 0.817, J = 1.203 
obviously a better choice of control than that generated 
by the standard gradient method. 
Case II. Modified gradient method 


§6u is chosen to satisfy 


f , 
ety | | u(t) +p (t) lat+ou = 6 


O 


which is the integral expression in equation (15). 
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t 
: £ . 
Then 6u‘t) = - 3 J. } ut) (2) api? es), Jat 
O 


where At = t.-t 
i: Oo 


a | u'4) atsx(et-1) | 


and 


yet) 7 K-(e-1) 


After application to the problem equation set and 
manipulation, a new difference equation in u is obtained. 


y (Ath) @ -0.465x,-0.8u‘*) 


with solution 


* 
y= -0.258x, 


Application of this control to the system results in 

wu = -1.034, x (1) = 0.818, J = 1.203 
This is virtually the same result as found by direct 
search. 

Of the choices offered by the two methods, the modi- 
fied gradient (which is scalar for this problem) converged 
to the optimum solution for this type of control (u(t) €Q ). 

The control gradient for this pmblem is sketched in 
Figure 3. which shows conditions existing after the op- 
timum control has been found. Examination of this figure 
reveals that this is not the optimum solution when all 
admissible choices of control are considered. A suitable 


choice for an admissible 65u(t) is shown in the figure. 


6) 


. 602 









§6u(t) (typical choice 





nem = G=ee Ge em es = = 





-.432 


Figure 3 


Linear Regulator - Veet) vs. time at optimal 
solution for class Q, control. An admissible 


control variation is shown. 
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The resulting 6J is 0.256€. 


What we have found is the best control in the set Q, Since, 


fe 
tr x 4 ib 
E (t)+ p (t) Jat = -]1,034 + 602 le i = 0 
Se 
and thus the variation in J is zero for small changes in 
this class of control. 
The method is simply extended to other classes of con- 


Grol. If the gemerol is sellctea from «bhe wet Q,. 


to tAt 
6J = | [9 (x(t) ,a(t) p(t), t) | bert -ee 
Xo 
t.+At 
ik 
+ | [ 7, C(x () ,w(t) -p(t),t) |6u(t) dt 
TT 
Moen co) ; 
here oa —) eee _ t,+4t a— 0,1 
coo 
O 


bu(t) €[Q,] 
t.+At 
--%! 7 Jae. 3 - 
and ou. a “At F V pe(x(t) u(t), p(t) ,t) dt, J 0,1 
J 
When applied to the example problem, a set of differ- 


ence equations is obtained. 
(i+1) 
eal 


ice) _ > ie (1) 
5 = -0.583x) 0.360u, 0.642u, 


= -0.352x. -0.281u. ‘*? 20 /385u, ‘*? 


0 A 


u 
The solution of these equations is: 


* * 
Uy = -0.75/7, Uy = =1.255,.. J3er Xo = 4.0 
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When these controls are applied to the regulator the re- 


sults are: 
* * 
x (1) = 0.795, and J = 1.169 


The optimal solution presented above compares closely to 


the solution attained by a direct search of controls in 


the set Q,. 


Es) 


7. COMPUTER PROGRAM AND NUMERICAL RESULTS 
An explanation of the computational program is pre- 
sented. The section concludes with the results of problems 
computed using the standard and modified gradients. 
7.1 The Gradient Program 
A flow diagram of the otenene algorithm was presented 
in section 2. This formulation was followed in coding the 
computer programs following. The main program accomplishes 
all "bookkeeping" operations required and contains logic 
statements for problem termination. Various subroutines 
are called to perform computations and other auxiliary func- 
tions. The interconnection of these programs is shown in 
Figure 4. 
The principal subsections of the main program and the 
subroutines are as follows: 
A. The input section 
The following inputs are required. 
1. The state initial condition vector “KX, 
2. The initial and final times 
3. The allowable integration step size for 
numerical accuracy 
4. The number of control intervals in the 
problem 
5. The convergence criterion 
B. Problem initialization 
This section divides the problem time into the 


desired number of equal control intervals and 
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~ifain program- 


Innut section 


Problem 
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COMMONS tora Fe 


gxecessible by 
fir outiane & 





-Subroutines-~ 


Integration 
1.Forvward 

a(x,u,t) 

g(x,u,t) 


ee ——— ~~ C:CUC 


c.Backward 
a(x,u,t) 


mH 


eradient 
Equation 


Search routine 


Dra rs, 
anf Y fa 


gradient xsener2tor 


compares the length of an interval with the al- 
lowable integration step size. If the interval 
length is too great, it is divided into subinter- 
vals until the allowable integration step is 
greater than, or equal to, a subinterval length. 
The integration step size is then made equal toa 
subinterval and the control during an interval is 
made up of identical controls in contained subin- 
tervals. This maintains integration accuracy 
while allowing flexibility in the number of con- 
trol intervals selected. Other values which need 
be calculated only once for a particular problem 
are generated at this time and stored for future 
use. 

Evaluation of performance index and calculation 
of the gradient vector 

This process is achieved by calling an integration 
subroutine with a control history u(t). 
Termination decision 

Problems are terminated when the norm of the gra- 
dient vector becomes less than the preset conver- 
gence criterion. This norm is taken to be of the 
form 


L-1 t.+At 
lv sll= = ») Fal V (x(t) ,u(t),p(t) ,t)at | 


7— 1 


where L is the number of control intervals. 
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An alternate exit is made after a preset number 
of iterations. 

Search for improved control 

This search is performed by calling a subroutine 
of the desired search algorithm. This section 
returns to section C. 

Output variables of interest 

The control history and up to four state variable 
trajectories are tabulated and graphs of desired 
variables are plotted. This completes the compu- 
tation. 

Integration subroutine 

This routine integrates the state equations and 
integral terms of the performance measure forward 
in time when called with a given choice of con- 
trol. At the final time non integral terms of 
the performance measure are evaluated and the 
cost (g(a), is calculated. An exit is provided 
at this point when a new gradient is not to be 
computed. If the gradient is to be computed, the 
costate boundary conditions are evaluated (p(t,)), 
and the states, costates, and the gradient equa- 
tion are integrated backwards in time. During 
the integration, the integral mean value of the 
gradient is computed for each control interval 
and stored. When the initial time is reached,the 


gradient generator is called with the array of 
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mean values and exit is made to the calling pro- 
gram. 

Search subroutine 

This routine computes a new trial control y TD) (4) 
by placing experiments at a distance B along the 
gradient vector. An experimental control is gen- 
erated by choosing a value of B and calling the 
control generator. An experiment is made by cal- 
ling the integration subroutine with this experi- 
mental control and evaluating the performance 
measure. The best distance (§*) is computed ac- 
cording to whichever algorithm (golden section 
search or quadratic approximation) is selected. 
The new control is calculated by the control gen- 
erator and exit is made to the calling program. 
Gradient generator 

When called with a vector of parameters whose di- 
mension is equal to the number of control inter- 
vals, this subroutine generates a new vector 
whose dimension is the number of integration steps 
in the problem. All values of the newly generated 
vector ina control interval have the same value 
as the single parameter associated with that in- 
terval. 

Control generator 

This routine is called with a reference control, a 


reference gradient and a scalar multiplier. The 
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new control generated is the sum of the reference 
control and the product of the scalar and the 
gradient. 

This same program is modified to compute a solution 
according to the standard gradient technique by evaluating 
the gradient equation at the beginning of each control in- 
terval and using this vector as the gradient. 

7.2 A Linear Regulator 

The program was initially applied to the linear regu- 
lator problem of section 6. An integration step size of 
0.01 and an initial guess of control, ee) = 0.0 were 


used. The control history was selected from &), A gold- 


0° 
en section search with a convergence interval of 1x107> was 
employed to ensure accuracy in determining the best step 
sixe. The convergence criterion for the gradient norm was 
1x10~°. 

Both methods converged on the second iteration to a 
value of 1.16126633 for the performance measure. 

Results of the problems are shown in Table 2. 

The fortunate initial guess of control and the struc- 
ture of the problem enable both methods to achieve conver- 
gence after only two iterations. Of interest is the pri- 
mary indicator of convergence, the norm of the gradient 
vector, which has indicated convergence with the modified 
method but has failed to achieve even the same order of 


magnitude using standard techniques. Both of these compu- 


tations incorporated an exit when the difference between 
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Gr edient 
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eid doe to W Ade 5370 7 COON ao UO. dee. 1 | 
ro cOobOe GC, QOL 7 C7 0.072949 


ime Ou OOoW 0,00158204 


O74.45974144 : 9.556446 


0.000082466 . Oe iS) Si, 


0, 0008001 5 ie: 
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successive costs was less than ne . 


Another computation was made with this cost exit re- 
moved and an initial guess of -1.0 for the control. A max- 
imum of 20 iterations was permitted. The results of this 
test are shown in Table 3. 

While the initial guess of -1.0 for control is closer 
to being optimal than the guess of 0.0 used previously, 
both methods are in a poorer starting position computation- 
ally. The standard gradient technique fails to achieve 
convergence after 20 iterations (the maximum allowed). Of 
primary interest are the gradient angular change and the 
step size, B™. While the standard gradient is roughly 
orthogonal initially, as the optimum is approached succes- 
Sive gradients are very nearly identical; moreover, the 
step size drops to the minimum value the search algorithm 
can generate. A gradient supposedly exists but search 
along this gradient fails to disclose a better control. 

The modified gradient computation, on the other hand, 
indicates a rather smooth convergence to the gradient norm 
cutoff and the orthogonality of successive gradients is 
maintained throughout the computation. The step size, B’, 
exhibits a tendency to form a pattern and, more important- 
ly, 2mpxowves the ‘control at each iteration. It is conclu- 
aed, thererore, that the method has talculatead the gra 
dient accurately. 

The optimal value of the performance measure generated 


by the modified method compares favorably with the true 
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optimal value of 1.16126214 computed by digital evaluation 
of the analytic solution of the problem. The error arises 
from the piecewise constant control restriction. 
7.3 Continuous Stirred Tank Reactor 

A more ambitious undertaking is the solution of the 
continuous stirred tank reactor problem posed by Lapidus 
(10) 


and Luus 


The state equations of the system are: 


25x, (t) 
x, (t) = ~2 (x, (t)+0.25) +(x, (t) +0.5) exp en 
~(x, (€)+0.25) u(t) 
25x, (t) 
x5(t) = 0.5 - x5(t) - (x, (t)+0.5)exp x, (t) +2 


and the performance measure to be minimized is: 
t 
ga” 2 2 
J = [x (t)+ x, (t) + O.lu (e) | dt 
BI 2 
t 
O 
Forming the augmented performance measure and utiliz- 
ing the necessary conditions results in the formulation of 


the costate equations and gradient equation. 


b,(t) = -2x, (t)+2p, () 


= 1 (t) 
50 
=p, (tC) (x. 46) +8. S|) | € x 
Py 2 aul Ae) 


25x, (t) 


50 
t aa(te).+p, (tc ) t )+0. 5)| ————s Jexel 
+p, ( )u( a(t) o( ea 7(t)#2 ) 
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725K { t@) 25x, (t) 
SP eter (08 “py (t) exp Sey + pa (t)| rex 5a a) 


p,(t,) os Po (ty) = ao 
v(x (t) u(t) p(t), t) = 0.2u(t)-p, (t)[ x, (t)+ 0.25 | 


The other given data are: 


to = 0.0, t. = 0.78 


X19 = O05, Xoq = OZOF x(t) unspecified 


The control to be generated is the flow of coolant 
through a coil immersed in the reactor. x, (t) is the de- 
Viation from the desired steady state temperature and 
x5 (t) is the deviation from the desired steady state che- 
mical concentration in the reactor. 

Solution of this problem using variation of extremals 
indicates the optimum value of the performance measure to 
be .02660336. As in the linear regulator problem, the ini- 
tial investigation 1S a comparison between the standard 
gradient method and the modified technique. The step size 
is determined by a golden section search. | 

Three exits were provided 
IW” Worm of gradient less than 1x107/ 
2. Improvement between successive iterations less 

than 1x10~° 
3. 20 iterations have been completed 
The integration step size selected was 0.01 and 78 


By ieas 


control intervals were provided. was chosen as 
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0.0. The results of the investigation are shown in Table 4. 

From this data, the principal advantage of the modi- 
fied gradient method is the orthogonality maintained by 
successive gradients, reinforcing the earlier conclusion 
that this method is computing an improved control gradient. 
Both methods terminated when the difference between succes- 
Sive evaluations of the performance measure was sufficient- 
ly small, but the standard gradient terminated more slowly 
and to a greater value. 

With this exit removed, the modified gradient conver- 
ged to the desired gradient norm on the ninth iteration. 
The value of J computed was 0.2660936. Successive gradi- 
ents maintained the orthogonality property throughout the 
process. 

Under the same conditions, the standard gradient meth- 
od failed to converge after 20 iterations. Successive gra- 
dients became less and less orthogonal as the optimum was 
approached and the algorithm began to break down after 16 
iterations. The minimum value of J computed was 0.0266120L 

Some further insight into the convergence tendencies 
of the two methods is gained by graphically comparing the 
logarithms of the gradient norms and the normalized errors 


in the performance measure. This latter quantity is gener- 
; * 
(i) _ gG)- 5 
=—_ | 
J 
using variation of extremals. These results are shown in 


* 
ated as € , where J is the value computed 


Figures 5 and 6 respectively. 


The modified gradient technique converges smoothly to 
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the optimum value for the class of control while the stand- 
ard technique behaves somewhat erratically and converges 
more slowly once values "near" the optimum are attained. 

The logarithm of the gradient norm exhibits a defin- 
ite pattern of convergence with the modified gradient and 
is again rather erratic under the standard method. Ina 
further computation using the modified gradient technique, 
the pattern displayed in Figure 5 was maintained up to a 
convergence criterion of isctame This computation con- 
verged after 19 iterations although no measurable improve- 
ment in the value of the performance measure was attained 
beyond the ninth iteration value of 0.02660936. 

All of the above investigations involved golden sec- 
tion searches in the determination of the correct step 
size and, while convergence using the modified gradient 
method is achieved in relatively few iterations, the number 
of experiments performed in the search is rather high. A 
total of 195 experiments were performed by the modified 
gradient technique in computing the solution of the pre- 
sently considered problem. 

The next investigation utilized the quadratic approx- 
imation method to determine a step size which gives effi- 
cient improvement with relatively few experiments. While 
the number of iterations may be expected to increase, it 
is expected that the total number of experiments will be 
decreased. 
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search is shown in Figures 7 and 8. 

The number of experiments made an the coneue asmensasy 
markedly reduced (to 57) by using the quadratic approxima- 
tion, and, while poorer initially than the golden section, 
compares quite favorably with that method as the optimum 
is approached. Successive gradients using the quadratic 
approximation behave erratically during the initial iter- 
ations, but become nearly orthogonal as the optimum is ap- 
proached. Computation time is sharply decreased as expec- 
ted from the reduced number of experiments required when 


using the quadratic approximation. 
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8.. CONCLUSIONS 
The problem of computing a control which minimizes a 
selected performance measure has been stated and a number 
of proposed methods for solution have been presented. 
Motivation for the use of the gradient algorithm has 
been: discussed and a method of improving the convergence 
of this procedure utilizing integral mean values of the 
gradient equation has been derived. Finally, an improved 
method of determining an approximate gradient step size has 
been presented. 
The selection: of: 
t.+At 
m3 K, | “[v,3e(e(t) -u(t) ,p(t) -t) Jat 
Sa 
is similar to the choice made by Hasdorff and Gupta 4) 
who applied the method to the solution of sampled data 
systems. 
Further reseach is required in the evaluation of: 


t.tAt 
aoe F.( yas2-| “Lo se(xe(t) u(t) p(t) .t)] at 
au. 5 Rr Unece ry oP ~8u, ' Pe ’ 0D ’ 


: j 
which may prove useful in determining a more accurate value 
for 6u.. Efforts in this direction are felt likely to 
prove fruitful in achieving further improvements in the 


technique. 


eo) 
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